SEQUENCE LISTING

<110> Lassen, Soren Flensted 

<120> Improved proteases and methods for producing them 

<130> 10423.204-WO-DK

<160> 53 

<170> PatentIn version 3.2

<210> 1 
<211> 1062 
<212> DNA 

<213> Nocardiopsis sp. NRRL 18262
<220> 

<221> misc_feature
<222> (1)..(495)

<223> Encodes the pro-region shown in positions -165 to -1 of SEQ ID
<220> 

<221> misc_feature 
<222> (496)..(1059)

<223> Encodes the mature region shown in positions 1-188 of SEQ ID
<400> 1 

gctactggag cattacctca gtctcctaca cctgaagcag atgcagtatc gatgcaagaa 60 
gcattacaac gtgatcttga tcttacatca gctgaagctg aggaattact tgctgcacaa 120 
gatacagcct ttgaagttga tgaagctgcc gctgaagcag ctggtgatgc atatggtggt 180 
tcagtattcg atactgaatc actcgaactt actgtactag tgaccgatgc agcagctgtt 240 
gaagctgttg aagccacagg tgcaggtaca gagctcgtat cttatggtat tgatggatta 300 
gatgagatcg tacaagagct taatgcagct gatgccgttc caggtgtagt tggatggtat 360 
cctgatgtag caggtgatac tgttgtctta gaagttcttg aaggctctgg agctgatgtt 420 
tctggacttt tagcagacgc aggagtcgat gcatccgcgg ttgaagtgac cacgtcagat 480 
cagcctgaac tctatgccga tatcattgga ggcctagcgt acacaatggg tggtcgctgc 540 
agcgtaggat ttgcagccac aaatgcagct ggacaacctg gcttcgtgac agctggacat 600 
tgcggccgcg tcggtacaca ggttactatc ggcaatggaa gaggtgtctt tgagcaaagc 660 
gtatttcccg ggaatgatgc tgccttcgtt agaggtacgt ccaactttac gcttactaac 720 
ttagtatcta gatacaacac tggcggatat gcaactgtag caggtcacaa tcaagcacct 780 
attggctcta gcgtctgccg ctcagggtcg actacaggat ggcattgtgg aaccattcaa 840 
gctagaggtc agagcgtgag ctatcctgaa ggtaccgtaa cgaacatgac tcgtacgact 900 
gtatgtgcag aaccaggtga ctctggaggt tcatatatca gcggtacgca agcgcaaggc 960 
gttacctcag gtggatccgg taactgtagg acaggtggca caacgttcta ccaggaagtg 1020 
acaccgatgg tgaactcttg gggagttaga ctccgtacat aa 1062 
10423.204-WO.ST25.txt

<210> 2

<211> 1143 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> [...] encoding an S2A protease denoted
10R fused by PCR in frame to the signal peptide encoding
sequence of a heterologous protease, Savinase

<400> 2 

atgaagaaac cgttggggaa aattgtcgca agcaccgcac tactcatttc tgttgctttt 60 
agttcatcga tcgcatcggc tgctactgga gcattacctc agtctcctac acctgaagca 120 
gatgcagtat cgatgcaaga agcattacaa cgtgatcttg atcttacatc agctgaagct 180 
gaggaattac ttgctgcaca agatacagcc tttgaagttg atgaagctgc cgctgaagca 240 
gctggtgatg catatggtgg ttcagtattc gatactgaat cactcgaact tactgtacta 300 
gtgaccgatg cagcagctgt tgaagctgtt gaagccacag gtgcaggtac agagctcgta 360 
tcttatggta ttgatggatt agatgagatc gtacaagagc ttaatgcagc tgatgccgtt 420 
ccaggtgtag ttggatggta tcctgatgta gcaggtgata ctgttgtctt agaagttctt 480 
gaaggctctg gagctgatgt ttctggactt ttagcagacg caggagtcga tgcatccgcg 540 
gttgaagtga ccacgtcaga tcagcctgaa ctctatgccg atatcattgg aggcctagcg 600 
tacacaatgg gtggtcgctg cagcgtagga tttgcagcca caaatgcagc tggacaacct 660 
ggcttcgtga cagctggaca ttgcggccgc gtcggtacac aggttactat cggcaatgga 720 
agaggtgtct ttgagcaaag cgtatttccc gggaatgatg ctgccttcgt tagaggtacg 780 
tccaacttta cgcttactaa cttagtatct agatacaaca ctggcggata tgcaactgta 840 
gcaggtcaca atcaagcacc tattggctct agcgtctgcc gctcagggtc gactacagga 900 
tggcattgtg gaaccattca agctagaggt cagagcgtga gctatcctga aggtaccgta 960 
acgaacatga ctcgtacgac tgtatgtgca gaaccaggtg actctggagg ttcatatatc 1020 
agcggtacgc aagcgcaagg cgttacctca ggtggatccg gtaactgtag gacaggtggc 1080 
acaacgttct accaggaagt gacaccgatg gtgaactctt ggggagttag actccgtaca 1140 
taa 1143

<210> 3 
<211> 8 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> C-terminal amino acid tail expressed as fusion to protease of the
invention.
<400> 3 

Gln Ser His Val Gln Ser Ala Pro
<210> 4 

WO 2004/111219 



PCT/DK2004/000431 



<211> 24 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention. 

<400> 4 

caatcgcatg ttcaatccgc tcca 24 

<210> 5 

<211> 4 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> C-terminal amino acid tail expressed as fusion to protease of the 
invention. 

<400> 5 

Gln Ser Ala Pro
1 

<210> 6 
<211> 12 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as 
fusion to protease of the invention. 

<400> 6 

caatcggctc ct 12 

<210> 7 

<211> 2 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> C-terminal amino acid tail expressed as fusion to protease of the 
invention. 

<400> 7 

Gln Pro
1 

<210> 8 
<211> 6 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention. 

<400> 8 

caacca 6
<210> 9

<211> 1 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> C-terminal amino acid tail expressed as fusion to protease of the
invention.

<400> 9 

Pro 
1 



<210> 10
<211> 3
<212> DNA



<213> Artificial sequence 
<220> 

<223> Polynucleotide encoding a C-terminal amino acid tail expressed as
fusion to protease of the invention.

<400> 10 

cca 3



<210> 11
<211> 45
<212> DNA

<213> Artificial sequence
<220>

<223> Primer #252639



<400> 11 

catgtgcatg tgggtaccgc aacgttcgca gatgctgctg aagag 45



<210> 12 

<211> 44 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #251992 

<400> 12 

catgtgcatg tggtcgaccg attatggagc ggattgaaca tgcg 44



<210> 13
<211> 44
<212> DNA

<213> Artificial sequence
<220>

<223> Primer #179541



<400> 13 

gcgttgagac gcgcggccgc gagcgccgtt tggctgaatg atac 44



<210> 14 

<211> 43 

<212> DNA 

<213> Artificial sequence 
<220>

<223> Primer #179542 
<400> 14 

gcgttgagac agctcgagca gggaaaaatg gaaccgcttt ttc 43 

<210> 15 

<211> 64 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #179539 

<400> 15 

ccatttgatc agaattcact ggccgtcgtt ttacaaccat tgcggaaaat agtcataggc 60 

atcc 64

<210> 16 

<211> 60 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #179540 

<400> 16 

ggatccagat ctggtacccg ggtctagagt cgacgcggcg gttcgcgtcc ggacagcaca 60 

<210> 17 

<211> 37 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #179154

<400> 17 

gttgtaaaac gacggccagt gaattctgat caaatgg 37 

<210> 18 

<211> 37 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #179153 

<400> 18 

ccgcgtcgac actagacacg ggtacctgat ctagatc 37 

<210> 19 
<211> 22 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #317 
<400> 19 

tggcgcaatc ggtaccatgg gg 22 
<210> 20
<211> 40 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #139 NotI 
<400> 20 

catgtgcatg cggccgcatt aacgcgttgc cgcttctgcg 40 

<210> 21 
<211> 7443 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Sequence of plasmid pMB1508
<400> 21 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240 

attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300 

tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360 

tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgataaaagt gctttttttg 420 

ttgcaattga agaattatta atgttaagct taattaaaga taatatcttt gaattgtaac 480 

gcccctcaaa agtaagaact acaaaaaaag aatacgttat atagaaatat gtttgaacct 540 

tcttcagatt acaaatatat tcggacggac tctacctcaa atgcttatct aactatagaa 600

tgacatacaa gcacaacctt gaaaatttga aaatataact accaatgaac ttgttcatgt 660

gaattatcgc tgtatttaat tttctcaatt caatatataa tatgccaata cattgttaca 720 

agtagaaatt aagacaccct tgatagcctt actataccta acatgatgta gtattaaatg 780 

aatatgtaaa tatatttatg ataagaagcg acttatttat aatcattaca tatttttcta 840 

ttggaatgat taagattcca atagaatagt gtataaatta tttatcttga aaggagggat 900 

gcctaaaaac gaagaacatt aaaaacatat atttgcaccg tctaatggat ttatgaaaaa 960 

tcattttatc agtttgaaaa ttatgtatta tggagctctg aaaaaaagga gaggataaag 1020 

aatgaagaaa ccgttgggga aaattgtcgc aagcaccgca ctactcattt ctgttgcttt 1080 

tagttcatcg atcgcatcgg ctgctgaaga agcaaaagaa aaatatttaa ttggctttaa 1140 

tgagcaggaa gctgtcagtg agtttgtaga acaagtagag gcaaatgacg aggtcgccat 1200 

tctctctgag gaagaggaag tcgaaattga attgcttcat gaatttgaaa cgattcctgt 1260 

tttatccgtt gagttaagcc cagaagatgt ggacgcgctt gaactcgatc cagcgatttc 1320 

ttatattgaa gaggatgcag aagtaacgac aatggcgcaa tcggtaccat ggggtatatc 1380 
aacgcgttaa tccgcggata tatagcggcc gcagatctgg gaccaataat aatgactaga 1440

gaagaaagaa tgaagattgt tcatgaaatt aaggaacgaa tattggataa agtgggatat 1500 

ttttaaaata tatatttatg ttacagtaat attgactttt aaaaaaggat tgattctaat 1560 

gaagaaagca gacaagtaag cctcctaaat tcactttaga taaaaattta ggaggcatat 1620 

caaatgaact ttaataaaat tgatttagac aattggaaga gaaaagagat atttaatcat 1680 

tatttgaacc aacaaacgac ttttagtata accacagaaa ttgatattag tgttttatac 1740 

cgaaacataa aacaagaagg atataaattt taccctgcat ttattttctt agtgacaagg 1800 

gtgataaact caaatacagc ttttagaact ggttacaata gcgacggaga gttaggttat 1860 

tgggataagt tagagccact ttatacaatt tttgatggtg tatctaaaac attctctggt 1920 

atttggactc ctgtaaagaa tgacttcaaa gagttttatg atttatacct ttctgatgta 1980 

gagaaatata atggttcggg gaaattgttt cccaaaacac ctatacctga aaatgctttt 2040 

tctctttcta ttattccatg gacttcattt actgggttta acttaaatat caataataat 2100

agtaattacc ttctacccat tattacagca ggaaaattca ttaataaagg taattcaata 2160 

tatttaccgc tatctttaca ggtacatcat tctgtttgtg atggttatca tgcaggattg 2220 

tttatgaact ctattcagga attgtcagat aggcctaatg actggctttt ataatatgag 2280 

ataatgccga ctgtactttt tacagtcggt tttctaacga tacattaata ggtacgaaaa 2340 

agcaactttt tttgcgctta aaaccagtca taccaataac ttaagggtaa ctagcctcgc 2400 

cggaaagagc gaaaatgcct cacatttgtg ccacctaaaa aggagcgatt tacatatgag 2460 

ttatgcagtt tgtagaatgc aaaaagtgaa atcagctgga ctaaaagggg ccgcagagta 2520 

gaatggaaaa ggggatcgga aaacaagtat ataggaggag acctatttat ggcttcagaa 2580 

aaagacgcag gaaaacagtc agcagtaaag cttgttccat tgcttattac tgtcgctgtg 2640 

ggactaatca tctggtttat tcccgctccg tccggacttg aacctaaagc ttggcatttg 2700 

tttgcgattt ttgtcgcaac aattatcggc tttatctcca agcccttgcc aatgggtgca 2760 

attgcaattt ttgcattggc ggttactgca ctaactggaa cactatcaat tgaggataca 2820 

ttaagcggat tcgggaataa gaccatttgg cttatcgtta tcgcattctt tatttcccgg 2880 

ggatttatca aaaccggtct cggtgcgaga atttcgtatg tattcgttca gaaattcgga 2940 

aaaaaaaccc ttggactttc ttattcactg ctattcagtg atttaatact ttcacctgct 3000 

attccaagta atacggcgcg tgcaggaggc attatatttc ctattatcag atcattatcc 3060 

gaaacattcg gatcaagccc ggcaaatgga acagagagaa aaatcggtgc attcttatta 3120 

aaaaccggtt ttcaggggaa tctgatcaca tctgctatgt tcctgacagc gatggcggcg 3180 

aacccgctga ttgccaagct ggcccatgat gtcgcagggg tggacttaac atggacaagc 3240 

tgggcaattg ccgcgattgt accgggactt gtaagcttaa tcatcacgcc gcttgtgatt 3300 

tacaaactgt atccgccgga aatcaaagaa acaccggatg cggcgaaaat cgcaacagaa 3360 

aaactgaaag aaatgggacc gttcaaaaaa tcggagcttt ccatggttat cgtgtttctt 3420 

ttggtgcttg tgctgtggat ttttggcggc agcttcaaca tcgacgctac cacaaccgca 3480

ttgatcggtt tggccgttct cttattatca caagttctga cttgggatga tatcaagaaa 3540

gaacagggcg cttgggatac gctcacttgg tttgcggcgc ttgtcatgct cgccaacttc 3600

ttgaatgaat taggcatggt gtcttggttc agtaatgcca tgaaatcatc cgtatcaggg 3660

ttctcttgga ttgtggcatt catcatttta attgttgtgt attattactc tcactatttc 3720

tttgcaagtg cgacagccca catcagtgcg atgtattcag catttttggc tgtcgtcgtg 3780

gcagcgggcg caccgccgct tttagcagcg ctgagcctcg cgttcatcag caacctgttc 3840

gggtcaacga ctcactacgg ttctggagcg gctccggtct tcttcggagc aggctacatc 3900

ccgcaaggca aatggtggtc catcggattt atcctgtcga ttgttcatat catcgtatgg 3960

cttgtgatcg gcggattatg gtggaaagta ctaggaatat ggtagaaaga aaaaggcaga 4020

cgcggtctgc ctttttttat tttcactcct tcgtaagaaa atggattttg aaaaatgaga 4080

aaattccctg tgaaaaatgg tatgatctag gtagaaagga cggctggtgc tgtggtgaaa 4140

aagcggttcc atttttccct gcaaacaaaa ataatggggc tgattgcggc tctgctggtc 4200

tttgtcattg gtgtgctgac cattacgtta gccgttcagc atacacaggg agaacggaga 4260

caggcagagc agctggcggt tcaaacggcg agaaccattt cctatatgcc gccggttaaa 4320

gagctcattg agagaaaaga cggacatgcg gctcagacgc aagaggtcat tgaacaaatg 4380

aaagaacaga ctggtgcgtt tgccatttat gttttgaacg aaaaaggaga cattcgcagc 4440

gcctctggaa aaagcggatt aaagaaactg gagcgcagca gagaaatttt gtttggcggt 4500

tcgcatgttt ctgaaacaaa agcggatgga cgaagagtga tcagagggag cgcgccgatt 4560

ataaaagaac agaagggata cagccaagtg atcggcagcg tgtctgttga ttttctgcaa 4620

acggagacag agcaaagcat caaaaagcat ttgagaaatt tgagtgtgat tgctgtgctt 4680

gtactgctgc tcggatttat tggcgccgcc gtgctggcga aaagcatcag aaaggatacg 4740

ctcgggcttg aaccgcatga gatcgcggct ctatatcgtg agaggaacgc aatgcttttc 4800

gcgattcgag aagggattat tgccaccaat cgtgaaggcg tcgtcaccat gatgaacgta 4860

tcggcggccg agatgctgaa gctgcccgag cctgtgatcc atcttcctat agatgacgtc 4920

atgccgggag cagggctgat gtctgtgctt gaaaaaggag aaatgctgcc gaaccaggaa 4980

gtaagcgtca acgatcaagt gtttattatc aatacgaaag tgatgaatca aggcgggcag 5040

gcgtatggga ttgtcgtcag cttcagggag aaaacagagc tgaagaagct gatcgacaca 5100

ttgacagagg ttcgcaaata ttcagaggat ctcagggcgc agactcatga attttcaaat 5160

aagctttatg cgattttagg gctgcgtcga cctgcaggca tgcaagcttg gcgtaatcat 5220

ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac aacatacgag 5280

ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc acattaattg 5340

cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa 5400

tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct tcctcgctca 5460

ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg 5520

taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc 5580

agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 5640

cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac 5700

tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 5760

tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 5820

gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 5880

acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 5940

acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 6000

cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 6060

gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg 6120

gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc 6180

agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt 6240

ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa 6300

ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc taaagtatat 6360

atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga 6420

tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata actacgatac 6480

gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca cgctcaccgg 6540

caccggattt atcagcaata aaccagccag ccggaagggc cgagcgcaga agtggtcctg 6600

caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga gtaagtagtt 6660

cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg gtgtcacgct 6720

cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga gttacatgat 6780

cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt gtcagaagta 6840

agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct cttactgtca 6900

tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca ttctgagaat 6960

agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat accgcgccac 7020

atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga aaactctcaa 7080

ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc aactgatctt 7140

cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg 7200

caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc ctttttcaat 7260

attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt gaatgtattt 7320

agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca cctgacgtct 7380

aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg aggccctttc 7440

gtc 7443

<210> 22 
<211> 5718 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Sequence of MB1510 genomic integration region 
<400> 22 

gagcgccgtt tggctgaatg atacaacagt ctcacttcct tactgcgtct ggttgcaaaa 60

acgaagaagc aaggattccc ctcgcttctc atttgtccta tttattatac acttttttaa 120

gcacatcttt ggcgcttgtt tcactagact tgatgcctct gaatcttgtc caagtgtcac 180

ggtccgcatc atagacttgt ccatttttca ccgctttgag atttttccag agcgggttcg 240

ttttccactc atctacaatg gttttgcctt cgttggctga gatgaacaaa atatcaggat 300

cgattttgct caattgctca aggctgacct cttgataggc gttatctgac ttcacagcgt 360

gtgtaaagcc tagcatttta aagatttctc cgtcatagga tgatgatgta tgaagctgga 420

aggaatccgc tcttgcaacg ccgagaacga tgttgcggtt ttcatctttc ggaagttcgg 480

cttttagatc gttgatgact tttttgtgct cggcaagctt ttcttttcct tcatcttctt 540

tatttaatgc tttagcaatg gtcgtaaagc tgtcgatcgt ttcgtcatat gtcgcttcac 600

ggctttttaa ttcaatcgtc ggggcgattt ttttcagctg tttataaatg tttttatggc 660

gctcagcgtc agcgatgatt aaatcaggct tcaaggaact gatgacctca agattgggtt 720

cgctgcgtgt gcctacagat gtgtaatcaa tggagctgcc gacaagcttt ttaatcatat 780

cttttttgtt gtcatctgcg atgcccaccg gcgtaatgcc gagattgtga acggcatcca 840

agaatgaaag ctcaagcaca accacccgct taggtgtgcc gcttactgtc gtttttcctt 900

cttcgtcatg gatcactctg gaatccttag actcgctttt gccgcttccg ttgttattct 960

ggcttgatga acagccggat acaatgaggc aggcgagcaa taaaacactc atgatggcaa 1020

tcaacttgtt agaataggtg cgcatgtcat tcttcctttt ttcagattta gtaatgagaa 1080

tcattatcac atgtaacact ataatagcat ggcttatcat gtcaatattt ttttagtaaa 1140

gaaagctgcg tttttactgc tttctcatga aagcatcatc agacacaaat aagtggtatg 1200

cagcgttacc gtgtcttcga gacaaaaacg catgggcgtt ggctttagag gtttcgaaca 1260

tatcagcagt gacataagga aggagagtgc tgagataacc ggacaatttc ttttctattt 1320

catctgttag tgcaaattca atgtcgccga tattcatgat aatcgagaaa acaaagtcga 1380

tatcgatatg aaaatgttcc tcggcaaaaa ccgcaagctc gtgaattcct ggtgaacatc 1440

cggcacgctt atggaaaatc tgtttgacta aatcactcac aatccaagca ttgtattgct 1500

gttctggtga aaagtattgc attagacata cctcctgctc gtacggataa aggcagcgtt 1560

tcatggtcgt gtgctccgtg cagcggcttc tccttaattt tgatttttct gaaaataggt 1620

cccgttccta tcactttacc atggacggaa aacaaatagc tactaccatt cctcctgttt 1680

ttctcttcaa tgttctggaa tctgtttcag gtacagacga tcgggtatga aagaaatata 1740

gaaaacatga aggaggaata tcgacatgaa accagttgta aaagagtata caaatgacga 1800

acagctcatg aaagatgtag aggaattgca gaaaatgggt gttgcgaaag aggatgtata 1860 

cgtcttagct cacgacgatg acagaacgga acgcctggct gacaacacga acgccaacac 1920 

gatcggagcc aaagaaacag gtttcaagca cgcggtggga aatatcttca ataaaaaagg 1980 

agacgagctc cgcaataaaa ttcacgaaat cggtttttct gaagatgaag ccgctcaatt 2040 

tgaaaaacgc ttagatgaag gaaaagtgct tctctttgtg acagataacg aaaaagtgaa 2100 

agcttgggca taaagcaagg aaaaaaccaa aaggccaatg tcggcctttt ggtttttttg 2160 

cggtctttgc ggtgggattt tgcagaatgc cgcaatagga tagcggaaca ttttcggttc 2220 

tgaatgtccc tcaatttgct attatatttt tgtgataaat tggaataaaa tctcacaaaa 2280 

tagaaaatgg gggtacatag tggatgaaaa aagtgatgtt agctacggct ttgtttttag 2340 

gattgactcc agctggcgcg aacgcagctg atttaggcca ccagacgttg ggatccaatg 2400 

atggctgggg cgcgtactcg accggcacga caggcggatc aaaagcatcc tcctcaaatg 2460 

tgtataccgt cagcaacaga aaccagcttg tctcggcatt agggaaggaa acgaacacaa 2520 

cgccaaaaat catttatatc aagggaacga ttgacatgaa cgtggatgac aatctgaagc 2580 

cgcttggcct aaatgactat aaagatccgg agtatgattt ggacaaatat ttgaaagcct 2640 

atgatcctag cacatggggc aaaaaagagc cgtcgggaac acaagaagaa gcgagagcac 2700 

gctctcagaa aaaccaaaaa gcacgggtca tggtggatat ccctgcaaac acgacgatcg 2760 

tcggttcagg gactaacgct aaagtcgtgg gaggaaactt ccaaatcaag agtgataacg 2820 

tcattattcg caacattgaa ttccaggatg cctatgacta ttttccgcaa tggttgtaaa 2880 

acgacggcca gtgaattctg atcaaatggt tcagtgagag cgaagcgaac acttgatttt 2940 

ttaattttct atcttttata ggtcattaga gtatacttat ttgtcctata aactatttag 3000 

cagcataata gatttattga ataggtcatt taagttgagc atattagagg aggaaaatct 3060 

tggagaaata tttgaagaac ccgagatcta gatcaggtac cgcaacgttc gcagatgctg 3120 

ctgaagagat tattaaaaag ctgaaagcaa aaggctatca attggtaact gtatctcagc 3180 

ttgaagaagt gaagaagcag agaggctatt gaataaatga gtagaaagcg ccatatcggc 3240 

gcttttcttt tggaagaaaa tatagggaaa atggtacttg ttaaaaattc ggaatattta 3300 

tacaatatca tatgtatcac attgaaagga ggggcctgct gtccagactg tccgctgtgt 3360 

aaaaataagg aataaagggg ggttgacatt attttactga tatgtataat ataatttgta 3420 

taagaaaatg gaggggccct cgaaacgtaa gatgaaacct tagataaaag tgcttttttt 3480 

gttgcaattg aagaattatt aatgttaagc ttaattaaag ataatatctt tgaattgtaa 3540 

cgcccctcaa aagtaagaac tacaaaaaaa gaatacgtta tatagaaata tgtttgaacc 3600 

ttcttcagat tacaaatata ttcggacgga ctctacctca aatgcttatc taactataga 3660 

atgacataca agcacaacct tgaaaatttg aaaatataac taccaatgaa cttgttcatg 3720 

tgaattatcg ctgtatttaa ttttctcaat tcaatatata atatgccaat acattgttac 3780 

aagtagaaat taagacaccc ttgatagcct tactatacct aacatgatgt agtattaaat 3840

gaatatgtaa atatatttat gataagaagc gacttattta taatcattac atatttttct 3900

attggaatga ttaagattcc aatagaatag tgtataaatt atttatcttg aaaggaggga 3960

tgcctaaaaa cgaagaacat taaaaacata tatttgcacc gtctaatgga tttatgaaaa 4020

atcattttat cagtttgaaa attatgtatt atggagctct gaaaaaaagg agaggataaa 4080

gagaaaaggg gatcggaaaa caagtatata ggaggagacc tatttatggc ttcagaaaaa 4140

gacgcaggaa aacagtcagc agtaaagctt gttccattgc ttattactgt cgctgtggga 4200

ctaatcatct ggtttattcc cgctccgtcc ggacttgaac ctaaagcttg gcatttgttt 4260

gcgatttttg tcgcaacaat tatcggcttt atctccaagc ccttgccaat gggtgcaatt 4320

gcaatttttg cattggcggt tactgcacta actggaacac tatcaattga ggatacatta 4380

agcggattcg ggaataagac catttggctt atcgttatcg cattctttat ttcccgggga 4440

tttatcaaaa ccggtctcgg tgcgagaatt tcgtatgtat tcgttcagaa attcggaaaa 4500

aaaacccttg gactttctta ttcactgcta ttcagtgatt taatactttc acctgctatt 4560

ccaagtaata cggcgcgtgc aggaggcatt atatttccta ttatcagatc attatccgaa 4620

acattcggat caagcccggc aaatggaaca gagagaaaaa tcggtgcatt cttattaaaa 4680

accggttttc aggggaatct gatcacatct gctatgttcc tgacagcgat ggcggcgaac 4740

ccgctgattg ccaagctggc ccatgatgtc gcaggggtgg acttaacatg gacaagctgg 4800

gcaattgccg cgattgtacc gggacttgta agcttaatca tcacgccgct tgtgatttac 4860

aaactgtatc cgccggaaat caaagaaaca ccggatgcgg cgaaaatcgc aacagaaaaa 4920

ctgaaagaaa tgggaccgtt caaaaaatcg gagctttcca tggttatcgt gtttcttttg 4980

gtgcttgtgc tgtggatttt tggcggcagc ttcaacatcg acgctaccac aaccgcattg 5040

atcggtttgg ccgttctctt attatcacaa gttctgactt gggatgatat caagaaagaa 5100

cagggcgctt gggatacgct cacttggttt gcggcgcttg tcatgctcgc caacttcttg 5160

aatgaattag gcatggtgtc ttggttcagt aatgccatga aatcatccgt atcagggttc 5220

tcttggattg tggcattcat cattttaatt gttgtgtatt attactctca ctatttcttt 5280

gcaagtgcga cagcccacat cagtgcgatg tattcagcat ttttggctgt cgtcgtggca 5340

gcgggcgcac cgccgctttt agcagcgctg agcctcgcgt tcatcagcaa cctgttcggg 5400

tcaacgactc actacggttc tggagcggct ccggtcttct tcggagcagg ctacatcccg 5460

caaggcaaat ggtggtccat cggatttatc ctgtcgattg ttcatatcat cgtatggctt 5520

gtgatcggcg gattatggtg gaaagtacta ggaatatggt agaaagaaaa aggcagacgc 5580

ggtctgcctt tttttatttt cactccttcg taagaaaatg gattttgaaa aatgagaaaa 5640

ttccctgtga aaaatggtat gatctaggta gaaaggacgg ctggtgctgt ggtgaaaaag 5700

cggttccatt tttccctg 5718

<210> 23

<211> 27

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1605

<400> 23

gacggccagt gaattcgata aaagtgc 27

<210> 24

<211> 42

<212> DNA

<213> Artificial sequence
<220>

<223> Primer 1606
<220>

<221> misc_feature

<222> (13)..(13)

<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (16)..(16)

<223> n is a, c, g, or t

<400> 24

ccagatctct atnktnktgt acggagtcta actccccaag ag 42



<210> 25 
<211> 1112 
<212> DNA 

<213> Nocardiopsis dassonvillei DSM 43235
<400> 25 

gcttttagtt catcgatcgc atcggctgct ccggcccccg tcccccagac ccccgtcgcc 60 

gacgacagcg ccgccagcat gaccgaggcg ctcaagcgcg acctcgacct cacctcggcc 120 

gaggccgagg agcttctctc ggcgcaggaa gccgccatcg agaccgacgc cgaggccacc 180 

gaggccgcgg gcgaggccta cggcggctca ctgttcgaca ccgagaccct cgaactcacc 240 

gtgctggtca ccgacgcctc cgccgtcgag gcggtcgagg ccaccggagc ccaggccacc 300 

gtcgtctccc acggcaccga gggcctgacc gaggtcgtgg aggacctcaa cggcgccgag 360 

gttcccgaga gcgtcctcgg ctggtacccg gacgtggaga gcgacaccgt cgtggtcgag 420 

gtgctggagg gctccgacgc cgacgtcgcc gccctgctcg ccgacgccgg tgtggactcc 480 

tcctcggtcc gggtggagga ggccgaggag gccccgcagg tctacgccga catcatcggc 540 

ggcctggcct actacatggg cggccgctgc tccgtcggct tcgccgcgac caacagcgcc 600 

ggtcagcccg gtttcgtcac cgccggccac tgcggcaccg tcggcaccgg cgtgaccatc 660 

ggcaacggca ccggcacctt ccagaactcg gtcttccccg gcaacgacgc cgccttcgtc 720 

cgcggcacct ccaacttcac cctgaccaac ctggtctcgc gctacaactc cggcggctac 780 

cagtcggtga ccggtaccag ccaggccccg gccggctcgg ccgtgtgccg ctccggctcc 840 
accaccggct ggcactgcgg caccatccag gcccgcaacc agaccgtgcg ctacccgcag 900

ggcaccgtct actcgctcac ccgcaccaac gtgtgcgccg agcccggcga ctccggcggt 960 

tcgttcatct ccggctcgca ggcccagggc gtcacctccg gcggctccgg caactgctcc 1020 

gtcggcggca cgacctacta ccaggaggtc accccgatga tcaactcctg gggtgtcagg 1080 

atccggacct aatcgcatgt tcaatccgct cc 1112 

<210> 26 

<211> 48 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1423 

<400> 26 

gcttttagtt catcgatcgc atcggctgct ccggcccccg tcccccag 48 

<210> 27 

<211> 45 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1475 

<400> 27 

ggagcggatt gaacatgcga ttaggtccgg atcctgacac cccag 45 

<210> 28 

<211> 354 

<212> PRT

<213> Nocardiopsis dassonvillei DSM 43235
<220>

<221> PROPEP

<222> (1)..(166)

<220>

<221> mat_peptide
<222> (167)..(354)

<400> 28 

Ala Pro Ala Pro Val Pro Gln Thr Pro Val Ala Asp Asp Ser Ala
-165 -160 -155

Ala Ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr Ser
-150 -145 -140

Ala Glu Ala Glu Glu Leu Leu Ser Ala Gln Glu Ala Ala Ile Glu
-135 -130 -125

Thr Asp Ala Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly
-120 -115 -110

Ser Leu Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105 -100 -95

Ala Ser Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gln Ala Thr Val
-90 -85 -80 -75

Val Ser His Gly Thr Glu Gly Leu Thr Glu Val Val Glu Asp Leu Asn
-70 -65 -60

Gly Ala Glu Val Pro Glu Ser Val Leu Gly Trp Tyr Pro Asp Val Glu
-55 -50 -45

Ser Asp Thr Val Val Val Glu Val Leu Glu Gly Ser Asp Ala Asp Val

-35 -30

Ala Ala Leu Leu Ala Asp Ala Gly Val Asp Ser Ser Ser Val Arg Val
-25 -20 -15

Glu Glu Ala Glu Glu Ala Pro Gln Val Tyr Ala Asp Ile Ile Gly Gly

Leu Ala Tyr Tyr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr
10 15 20

Asn Ser Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr
25 30 35

Val Gly Thr Gly Val Thr Ile Gly Asn Gly Thr Gly Thr Phe Gln Asn

45 50

Ser Val Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn
55 60 65 70

Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Gln
75 80 85

Ser Val Thr Gly Thr Ser Gln Ala Pro Ala Gly Ser Ala Val Cys Arg
90 95 100

Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn
105 110 115

Gln Thr Val Arg Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr

125 130

Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser Gly

140 145 150

Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Val
155 160 165

Gly Gly Thr Thr Tyr Tyr Gln Glu Val Thr Pro Met Ile Asn Ser Trp
170 175 180

Gly Val Arg Ile Arg Thr
185

<210> 29 
<211> 498 
<212> DNA 

<213> Nocardiopsis dassonvillei DSM 43235 
<400> 29 

gctccggccc ccgtccccca gacccccgtc gccgacgaca gcgccgccag catgaccgag 60 
gcgctcaagc gcgacctcga cctcacctcg gccgaggccg aggagcttct ctcggcgcag 120 
gaagccgcca tcgagaccga cgccgaggcc accgaggccg cgggcgaggc ctacggcggc 180 
tcactgttcg acaccgagac cctcgaactc accgtgctgg tcaccgacgc ctccgccgtc 240 
gaggcggtcg aggccaccgg agcccaggcc accgtcgtct cccacggcac cgagggcctg 300 
accgaggtcg tggaggacct caacggcgcc gaggttcccg agagcgtcct cggctggtac 360 
ccggacgtgg agagcgacac cgtcgtggtc gaggtgctgg agggctccga cgccgacgtc 420 
gccgccctgc tcgccgacgc cggtgtggac tcctcctcgg tccgggtgga ggaggccgag 480 
gaggccccgc aggtctac 498

<210> 30
<211> 166
<212> PRT

<213> Nocardiopsis dassonvillei DSM 43235
<400> 30

Ala Pro Ala Pro Val Pro Gln Thr Pro Val Ala Asp Asp Ser Ala Ala
1 5 10 15

Ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ser Ala Gln Glu Ala Ala Ile Glu Thr Asp Ala

Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly Ser Leu Phe Asp

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val

75 80

Glu Ala Val Glu Ala Thr Gly Ala Gln Ala Thr Val Val Ser His Gly
85 90 95

Thr Glu Gly Leu Thr Glu Val Val Glu Asp Leu Asn Gly Ala Glu Val
100 105 110

Pro Glu Ser Val Leu Gly Trp Tyr Pro Asp Val Glu Ser Asp Thr Val
115 120 125

Val Val Glu Val Leu Glu Gly Ser Asp Ala Asp Val Ala Ala Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Ser Ser Ser Val Arg Val Glu Glu Ala Glu

150 155 160

Glu Ala Pro Gln Val Tyr
165

<210> 31 
<211> 1146 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> The DNA sequence coding for the pro-region of SEQ ID NO: 29 fused
<400> 31 

atgaagaaac cgttggggaa aattgtcgca agcaccgcac tactcatttc tgttgctttt 60 

agttcatcga tcgcatcggc tgctccggcc cccgtccccc agacccccgt cgccgacgac 120 

agcgccgcca gcatgaccga ggcgctcaag cgcgacctcg acctcacctc ggccgaggcc 180 

gaggagcttc tctcggcgca ggaagccgcc atcgagaccg acgccgaggc caccgaggcc 240 

gcgggcgagg cctacggcgg ctcactgttc gacaccgaga ccctcgaact caccgtgctg 300 

gtcaccgacg cctccgccgt cgaggcggtc gaggccaccg gagcccaggc caccgtcgtc 360 

tcccacggca ccgagggcct gaccgaggtc gtggaggacc tcaacggcgc cgaggttccc 420 

gagagcgtcc tcggctggta cccggacgtg gagagcgaca ccgtcgtggt cgaggtgctg 480 

gagggctccg acgccgacgt cgccgccctg ctcgccgacg ccggtgtgga ctcctcctcg 540 

gtccgggtgg aggaggccga ggaggccccg caggtctatg ccgatatcat tggaggccta 600 

gcgtacacaa tgggtggtcg ctgcagcgta ggatttgcag ccacaaatgc agctggacaa 660 

cctggcttcg tgacagctgg acattgcggc cgcgtcggta cacaggttac tatcggcaat 720 

ggaagaggtg tctttgagca aagcgtattt cccgggaatg atgctgcctt cgttagaggt 780 

acgtccaact ttacgcttac taacttagta tctagataca acactggcgg atatgcaact 840 

gtagcaggtc acaatcaagc acctattggc tctagcgtct gccgctcagg gtcgactaca 900 

ggatggcatt gtggaaccat tcaagctaga ggtcagagcg tgagctatcc tgaaggtacc 960 

gtaacgaaca tgactcgtac gactgtatgt gcagaaccag gtgactctgg aggttcatat 1020 

atcagcggta cgcaagcgca aggcgttacc tcaggtggat ccggtaactg taggacaggt 1080 

ggcacaacgt tctaccagga agtgacaccg atggtgaact cttggggagt tagactccgt 1140 

acataa 1146
<210> 32
<211> 1068 
<212> DNA 

<213> Nocardiopsis alba DSM 15647
<400> 32 

gcgaccggcc ccctccccca gtcccccacc ccggatgaag ccgaggccac caccatggtc 60 

gaggccctcc agcgcgacct cggcctgtcc ccctctcagg ccgacgagct cctcgaggcg 120 

caggccgagt ccttcgagat cgacgaggcc gccaccgcgg ccgcagccga ctcctacggc 180 

ggctccatct tcgacaccga cagcctcacc ctgaccgtcc tggtcaccga cgcctccgcc 240 

gtcgaggcgg tcgaggccgc cggcgccgag gccaaggtgg tctcgcacgg catggagggc 300 

ctggaggaga tcgtcgccga cctgaacgcg gccgacgctc agcccggcgt cgtgggctgg 360 

taccccgaca tccactccga cacggtcgtc ctcgaggtcc tcgagggctc cggtgccgac 420 

gtggactccc tgctcgccga cgccggtgtg gacaccgccg acgtcaaggt ggagagcacc 480 

accgagcagc ccgagctgta cgccgacatc atcggcggtc tcgcctacac catgggtggg 540 

cgctgctcgg tcggcttcgc ggccaccaac gcctccggcc agcccgggtt cgtcaccgcc 600

ggccactgcg gcaccgtcgg caccccggtc agcatcggca acggccaggg cgtcttcgag 660 

cgttccgtct tccccggcaa cgactccgcc ttcgtccgcg gcacctcgaa cttcaccctg 720 

accaacctgg tcagccgcta caacaccggt ggttacgcga ccgtctccgg ctcctcgcag 780 

gcggcgatcg gctcgcagat ctgccgttcc ggctccacca ccggctggca ctgcggcacc 840 

gtccaggccc gcggccagac ggtgagctac ccccagggca ccgtgcagaa cctgacccgc 900 

accaacgtct gcgccgagcc cggtgactcc ggcggctcct tcatctccgg cagccaggcc 960 

cagggcgtca cctccggtgg ctccggcaac tgctccttcg gtggcaccac ctactaccag 1020 

gaggtcaacc cgatgctgag cagctggggt ctgaccctgc gcacctga 1068 

<210> 33 

<211> 355 

<212> PRT

<213> Nocardiopsis alba DSM 15647
<220>

<221> PROPEP

<222> (1)..(167)

<220>

<221> mat_peptide
<222> (168)..(355)

<400> 33 

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Asp Glu Ala Glu
-165 -160 -155

Ala Thr Thr Met Val Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser
-150 -145 -140

Pro Ser Gln Ala Asp Glu Leu Leu Glu Ala Gln Ala Glu Ser Phe
-135 -130 -125

Glu Ile Asp Glu Ala Ala Thr Ala Ala Ala Ala Asp Ser Tyr Gly
-120 -115 -110

Gly Ser Ile Phe Asp Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr
-105 -100 -95

Asp Ala Ser Ala Val Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys
-90 -85 -80

Val Val Ser His Gly Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu
-75 -70 -65 -60

Asn Ala Ala Asp Ala Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile
-55 -50 -45

His Ser Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp
-40 -35 -30

Val Asp Ser Leu Leu Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys
-25 -20 -15

Val Glu Ser Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly
-10 -5 -1 1 5

Gly Leu Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala
10 15 20

Thr Asn Ala Ser Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly
25 30 35

Thr Val Gly Thr Pro Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu
40 45 50

Arg Ser Val Phe Pro Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser
55 60 65

Asn Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr
70 75 80 85

Ala Thr Val Ser Gly Ser Ser Gln Ala Ala Ile Gly Ser Gln Ile Cys
90 95 100

Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Val Gln Ala Arg
105 110 115

Gly Gln Thr Val Ser Tyr Pro Gln Gly Thr Val Gln Asn Leu Thr Arg
120 125 130

Thr Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser
135 140 145

Gly Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser
150 155 160 165

Phe Gly Gly Thr Thr Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser
170 175 180

Trp Gly Leu Thr Leu Arg Thr
185

<210> 34 
<211> 43 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1421 
<400> 34 

gttcatcgat cgcatcggct gcgaccggcc ccctccccca gtc 43

<210> 35 

<211> 31 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1604 

<400> 35 

gcggatccta tcaggtgcgc agggtcagac c 31 

<210> 36 
<211> 1062 
<212> DNA 

<213> Nocardiopsis prasina DSM 15648 
<400> 36 

gccaccggac cgctccccca gtcacccacc ccggaggccg acgccgtctc catgcaggag 60 

gcgctccagc gcgacctcgg cctgaccccg cttgaggccg atgaactgct ggccgcccag 120 

gacaccgcct tcgaggtcga cgaggccgcg gccgcggccg ccggggacgc ctacggcggc 180 

tccgtcttcg acaccgagac cctggaactg accgtcctgg tcaccgacgc cgcctcggtc 240 

gaggctgtgg aggccaccgg cgcgggtacc gaactcgtct cctacggcat cgagggcctc 300 

gacgagatca tccaggatct caacgccgcc gacgccgtcc ccggcgtggt cggctggtac 360 

ccggacgtgg cgggtgacac cgtcgtcctg gaggtcctgg agggttccgg agccgacgtg 420 

agcggcctgc tcgccgacgc cggcgtggac gcctcggccg tcgaggtgac cagcagtgcg 480 

cagcccgagc tctacgccga catcatcggc ggtctggcct acaccatggg cggccgctgt 540 

tcggtcggat tcgcggccac caacgccgcc ggtcagcccg gattcgtcac cgccggtcac 600 

tgtggccgcg tgggcaccca ggtgagcatc ggcaacggcc agggcgtctt cgagcagtcc 660 

atcttcccgg gcaacgacgc cgccttcgtc cgcggcacgt ccaacttcac gctgaccaac 720 
ctggtcagcc gctacaacac cggcggttac gccaccgtcg ccggccacaa ccaggcgccc 780

atcggctcct ccgtctgccg ctccggctcc accaccggct ggcactgcgg caccatccag 840 

gcccgcggcc agtcggtgag ctaccccgag ggcaccgtca ccaacatgac ccggaccacc 900 

gtgtgcgccg agcccggcga ctccggcggc tcctacatct ccggcaacca ggcccagggc 960 

gtcacctccg gcggctccgg caactgccgc accggcggga ccaccttcta ccaggaggtc 1020 

acccccatgg tgaactcctg gggcgtccgt ctccggacct aa 1062 

<210> 37 

<211> 353 

<212> PRT 

<213> Nocardiopsis prasina DSM 15648 
<220> 

<221> PROPEP 

<222> (1)..(165)

<220>

<221> mat_peptide
<222> (166)..(353)

<400> 37 

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165 -160 -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150 -145 -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135 -130 -125

Val Asp Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly
-120 -115 -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105 -100 -95 -90

Ala Ala Ser Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
-85 -80 -75

Val Ser Tyr Gly Ile Glu Gly Leu Asp Glu Ile Ile Gln Asp Leu Asn
-70 -65 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
-55 -50 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
-40 -35 -30

Ser Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25 -20 -15 -10

Thr Ser Ser Ala Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
-5 -1 1 5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
10 15 20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
25 30 35

Gly Thr Gln Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu Gln Ser
40 45 50 55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
60 65 70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
75 80 85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
90 95 100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
105 110 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120 125 130 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
140 145 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
155 160 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
170 175 180

Val Arg Leu Arg Thr
185

<210> 38 

<211> 43 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1346 

<400> 38 

gttcatcgat cgcatcggct gccaccggac cgctccccca gtc 43 



<210> 39 
<211> 38 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Primer 1602 

<400> 39 

gcggatccta ttaggtccgg agacggacgc cccaggag 38

<210> 40 
<211> 1062 
<212> DNA 

<213> Nocardiopsis prasina DSM 15649 
<400> 40 

gccaccggac cactccccca gtcacccacc ccggaggccg acgccgtctc catgcaggag 60 

gcgctccagc gcgacctcgg cctgaccccg cttgaggccg atgaactgct ggccgcccag 120 

gacaccgcct tcgaggtcga cgaggccgcg gccgaggccg ccggtgacgc ctacggcggc 180 

tccgtcttcg acaccgagac cctggaactg accgtcctgg tcaccgactc cgccgcggtc 240 

gaggcggtgg aggccaccgg cgccgggacc gaactggtct cctacggcat cacgggcctc 300 

gacgagatcg tcgaggagct caacgccgcc gacgccgttc ccggcgtggt cggctggtac 360 

ccggacgtcg cgggtgacac cgtcgtgctg gaggtcctgg agggttccgg cgccgacgtg 420 

ggcggcctgc tcgccgacgc cggcgtggac gcctcggcgg tcgaggtgac caccaccgag 480 

cagcccgagc tgtacgccga catcatcggc ggtctggcct acaccatggg cggccgctgt 540 

tcggtcggct tcgcggccac caacgccgcc ggtcagcccg ggttcgtcac cgccggtcac 600 

tgtggccgcg tgggcaccca ggtgaccatc ggcaacggcc ggggcgtctt cgagcagtcc 660 

atcttcccgg gcaacgacgc cgccttcgtc cgcggaacgt ccaacttcac gctgaccaac 720 

ctggtcagcc gctacaacac cggcggctac gccaccgtcg ccggtcacaa ccaggcgccc 780 

atcggctcct ccgtctgccg ctccggctcc accaccggtt ggcactgcgg caccatccag 840 

gcccgcggcc agtcggtgag ctaccccgag ggcaccgtca ccaacatgac gcggaccacc 900 

gtgtgcgccg agcccggcga ctccggcggc tcctacatct ccggcaacca ggcccagggc 960 

gtcacctccg gcggctccgg caactgccgc accggcggga ccaccttcta ccaggaggtc 1020 

acccccatgg tgaactcctg gggcgtccgt ctccggacct aa 1062 

<210> 41 

<211> 353 

<212> PRT

<213> Nocardiopsis prasina DSM 15649
<220>

<221> PROPEP

<222> (1)..(165)

<220>

<221> mat_peptide
<222> (166)..(353)

<400> 41 
Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165 -160 -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150 -145 -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135 -130 -125

Val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly
-120 -115 -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105 -100 -95 -90

Ser Ala Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
-85 -80 -75

Val Ser Tyr Gly Ile Thr Gly Leu Asp Glu Ile Val Glu Glu Leu Asn
-70 -65 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
-55 -50 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
-40 -35 -30

Gly Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25 -20 -15 -10

Thr Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
-5 -1 1 5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
10 15 20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
25 30 35

Gly Thr Gln Val Thr Ile Gly Asn Gly Arg Gly Val Phe Glu Gln Ser
40 45 50 55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
60 65 70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
75 80 85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
90 95 100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
105 110 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120 125 130 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
140 145 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
155 160 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
170 175 180

Val Arg Leu Arg Thr
185

<210> 42 

<211> 43 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 1603 

<400> 42 

gttcatcgat cgcatcggct gccaccggac cactccccca gtc 43 

<210> 43 

<211> 353 

<212> PRT

<213> Nocardiopsis sp. NRRL 18262
<220>

<221> PROPEP

<222> (1)..(165)

<220>

<221> mat_peptide
<222> (166)..(1059)

<400> 43 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165 -160 -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser
-150 -145 -140

Ala Glu Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135 -130 -125

Val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly
-120 -115 -110




WO 2004/111219 

PCT/DK2004/000431 

10423.204-WO.ST25.txt

Ser Val Phe Asp Thr Glu Ser Leu Glu Leu Thr Val Leu Val Thr Asp
-105 -100 -95 -90

Ala Ala Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
-85 -80 -75

Val Ser Tyr Gly Ile Asp Gly Leu Asp Glu Ile Val Gln Glu Leu Asn
-70 -65 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
-55 -50 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
-40 -35 -30

Ser Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25 -20 -15 -10

Thr Thr Ser Asp Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
-5 -1 1 5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
10 15 20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
25 30 35

Gly Thr Gln Val Thr Ile Gly Asn Gly Arg Gly Val Phe Glu Gln Ser
40 45 50 55

Val Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
60 65 70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
75 80 85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
90 95 100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
105 110 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120 125 130 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Thr
140 145 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
155 160 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
170 175 180

Val Arg Leu Arg Thr
185

<210> 44 

<211> 1164 

<212> DNA 

<213> artificial sequence 
<220> 

<223> Synthetic protease encoding gene 
<220> 

<221> CDS 

<222> (1)..(1164) 

<223> Full length protease 

<220> 

<221> sig_peptide 
<222> (1)..(81) 

<220> 

<221> misc_feature

<222> (82)..(1164)

<223> Propeptide

<220>

<221> mat_peptide
<222> (577)..(1164)

<400> 44 

mo? ? aa £ cg ctg gga aaa a J t gtc gca agc aca gca ctt ctt 45
Met Lys Lys Pro Leu Gly Lys Ile Val Ala Ser Thr Ala Leu Leu
-190 -185 -180

att tca gtg gca ttt agc tca tct att gca tca gca act aca aaa 90
Ile Ser Val Ala Phe Ser Ser Ser Ile Ala Ser Ala Ala Thr Gly
-175 -170 -165

gca tta ccg cag tct ccg aca ccg gaa gca gat gca ate tca atg 135
Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val Ser Met
-160 -155 -150

rin ?? a ?i a ? tg £ aa aga gat ctt gat ctt aca tca gca gaa gca 180
Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu Ala
-145 -140 -135

?? a ? aa ? Zt ? tz g ? t g f a caa aat aca gca ttt gaa gtg gat gaa 225
Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
-130 -125 -120

?? a ?? g ?? a 5 aa g f a g ? a g ga gat gca tat ggc ggc tca gtt ttt 270
Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe
-115 -110 -105

gat aca gaa tca ctt gaa ctt aca gtt ctt gtt aca gat gca aca aca 318
Asp Thr Glu Ser Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ala Ala
-100 -95 -90

gtt gaa gca gtt gaa gca aca gga gca gga aca gta ctt att tca taf 366
Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Val Leu Val Ser Tyr
-85 -80 -75

? at 2? c ? zt gat gaa a ? t gtt caa gaa ctg aat gca gct gat 414
Gly Ile Asp Gly Leu Asp Glu Ile Val Gln Glu Leu Asn Ala Ala Asp
-70 -65 -60 -55

gct gtt ccg ggc gtt gtt ggc tgg tat ccg gat gtt gct gga gat aca 462
Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
-50 -45 -40

gtt gtc ctt gaa gtt ctt gaa gga tca ggc gca gat gtt tca ggc cta 510
Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Ser Gly Leu
-35 -30 -25

ctg gca gac gca gga gtc gat gca tca gca gtt gaa gtt aca aca tca 558
Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser
-20 -15 -10

gat caa ccg gaa ctt tat gca gat att att ggc ggc ctg gca tat tat 606
Asp Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu Ala Tyr Tyr
-5 -1 1 5 10

2? c 9$ c aga tgc agc gt ? ggc ttt Q ca gca aca aat gca tca ggc 654
Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn Ala Ser Gly
15 20 25

caa ccg ggc ttt gtt aca gca ggc cat tgc ggc aca gtt ggc aca cca 702
Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr Val Gly Thr Pro
30 35 40

gtt tca att ggc aat ggc aaa ggc gtt ttt gaa cga agc att ttt cca 750
Val Ser Ile Gly Asn Gly Lys Gly Val Phe Glu Arg Ser Ile Phe Pro
45 50 55

ggc aat gat tca gca ttt gtt aga ggc aca tca aat ttt aca ctt aca 798
Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser Asn Phe Thr Leu Thr
60 65 70

aat ctg gtt tca aga tat aat tca ggc ggc tat gca aca gtt gca ggc 846
Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Ala Thr Val Ala Gly
75 80 85 90

cat aat caa gca ccg att ggc tca gca gtt tgc aga tca ggc tca aca 894
His Asn Gln Ala Pro Ile Gly Ser Ala Val Cys Arg Ser Gly Ser Thr
95 100 105

aca ggc tgg cat tgc ggc aca att caa gca aga aat caa aca gtt aaa 942
Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn Gln Thr Val Arg
110 115 120

tat ccg caa ggc aca gtt tat agt ctg aca aga aca aca gtt tgt gca 990
Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr Thr Val Cys Ala
125 130 135

gaa ccg ggc gat tca ggc ggc tca tat att agc ggc act caa gca caa 1038
Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln
140 145 150

P?w S3? ?u a £ ca ?? c ggc tca ggc aat tgc agt gct ggc ggc aca aca 1086
Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Ala Gly Gly Thr Thr
155 160 165 170

tat tac caa gaa gtt aat ccg atg ctt agt tca tgg ggc ctt aca ctt 1134
Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser Trp Gly Leu Thr Leu
175 180 185

aga aca caa tcg cat gtt caa tcc gct cca 1164
Arg Thr Gln Ser His Val Gln Ser Ala Pro
190 195
<210> 45
<211> 388 
<212> PRT 

<213> artificial sequence 
<220> 

<223> Synthetic Construct 
<400> 45 

Met Lys Lys Pro Leu Gly Lys Ile Val Ala Ser Thr Ala Leu Leu
-190 -185 -180

Ile Ser Val Ala Phe Ser Ser Ser Ile Ala Ser Ala Ala Thr Gly
-175 -170 -165

Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val Ser Met
-160 -155 -150

Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu Ala
-145 -140 -135

Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
-130 -125 -120

Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe
-115 -110 -105

Asp Thr Glu Ser Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ala Ala
-100 -95 -90

Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Val Leu Val Ser Tyr
-85 -80 -75

Gly Ile Asp Gly Leu Asp Glu Ile Val Gln Glu Leu Asn Ala Ala Asp
-70 -65 -60 -55

Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
-50 -45 -40

Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Ser Gly Leu
-35 -30 -25

Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser
-20 -15 -10

Asp Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu Ala Tyr Tyr
-5 -1 1 5 10

Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn Ala Ser Gly
15 20 25

Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr Val Gly Thr Pro
30 35 40

Val Ser Ile Gly Asn Gly Lys Gly Val Phe Glu Arg Ser Ile Phe Pro
45 50 55

Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser Asn Phe Thr Leu Thr
60 65 70

Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Ala Thr Val Ala Gly
75 80 85 90

His Asn Gln Ala Pro Ile Gly Ser Ala Val Cys Arg Ser Gly Ser Thr
95 100 105

Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn Gln Thr Val Arg
110 115 120

Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr Thr Val Cys Ala
125 130 135

Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln
140 145 150

Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Ala Gly Gly Thr Thr
155 160 165 170

Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser Trp Gly Leu Thr Leu
175 180 185

Arg Thr Gln Ser His Val Gln Ser Ala Pro
190 195

<210> 46 
<211> 165 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Shuffled propeptide G-2.19
<220> 

<221> PROPEP 
<222> (1)..(165)

<400> 46 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe Asp
50 55 60

Thr Glu Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser Asp
145 150 155 160

Gln Pro Glu Leu Tyr
165



<210> 47 

<211> 166 

<212> PRT

<213> Artificial sequence 
<220> 

<223> Shuffled propeptide G-2.73 
<220> 

<221> PROPEP 

<222> (1)..(166)

<400> 47 



Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Ser Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Gly Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165

<210> 48 
<211> 166 
<212> PRT

<213> Artificial sequence 
<220> 

<223> shuffled propeptide G-1.43 
<220> 

<221> PROPEP 
<222> (1)..(166)

<400> 48 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser Ser Ser Gln
20 25 30

Ala Glu Glu Leu Leu Asp Ala Gln Ala Glu Ser Phe Glu Ile Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165

<210> 49 
<211> 166 
<212> PRT

<213> Artificial sequence 
<220> 

<223> shuffled propeptide G-2.6 
<220> 

<221> PROPEP
<222> (1)..(166)

<400> 49 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ser Ser Ser Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140
Ala Gly Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165

<210> 50 
<211> 165 
<212> PRT

<213> Artificial sequence 
<220> 

<223> Shuffled propeptide G-2.5 
<220> 

<221> PROPEP 
<222> (1)..(165)

<400> 50 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro Leu Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly Ser Val Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Pro Ala Ala
145 150 155 160

Arg Pro Glu Leu Tyr
165

<210> 51 
<212> PRT

<213> Artificial sequence 
<220> 

<223> Shuffled propeptide G-2.3 
<220> 

<221> PROPEP 

<222> (1)..(166)

<400> 51 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Asp Gly Ala Glu Ala
1 5 10 15

Thr Thr Met Val Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro Ala
20 25 30

Glu Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp
35 40 45

Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe
50 55 60

Asp Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ala Ala
65 70 75 80

Val Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His
85 90 95

Gly Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp
100 105 110

Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Tyr Ser Leu
130 135 140

Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Pro Ala
145 150 155 160

Ala Gln Pro Glu Leu Tyr
165

<210> 52 
<211> 166 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> shuffled propeptide G-1.4 
<220> 
<400> 52

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser Ser Ser Gln
20 25 30

Ala Glu Glu Leu Leu Asp Ala Gln Ala Glu Ser Phe Glu Ile Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Ala Asp Ser Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr Asp Ala Ser Ala Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165

<210> 53 
<211> 166 
<212> PRT

<213> Artificial sequence 
<220> 

<223> Shuffled propeptide G-1.2 
<220> 

<221> PROPEP 
<222> (1)..(166)



<400> 53 

Ala Thr Gly Ala Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala Val
1 5 10 15

Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Asp Leu Thr Ser Ala Glu
20 25 30

Ala Glu Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu Val Asp Glu
35 40 45

Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly Ser Ile Phe Asp
50 55 60

Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp Ser Ser Ser Val
65 70 75 80

Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys Val Val Ser His Gly
85 90 95

Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu Asn Ala Ala Asp Ala
100 105 110

Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile His Ser Asp Thr Val
115 120 125

Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Asp Ser Leu Leu
130 135 140

Ala Gly Ala Gly Val Asp Thr Ala Asp Val Lys Val Glu Ser Thr Thr
145 150 155 160

Glu Gln Pro Glu Leu Tyr
165